Wiretap and Gelfand-Pinsker Channels Analogy and its Applications
An analogy framework between wiretap channels (WTCs) and state-dependent
point-to-point channels with non-causal encoder channel state information
(referred to as Gelfand-Pinsker channels (GPCs)) is proposed. A good sequence of
stealth-wiretap codes is shown to induce a good sequence of codes for a
corresponding GPC. Consequently, the framework enables exploiting existing
results for GPCs to produce converse proofs for their wiretap analogs. The
analogy readily extends to multiuser broadcasting scenarios, encompassing
broadcast channels (BCs) with deterministic components, degradation ordering
between users, and BCs with cooperative receivers. Given a wiretap BC (WTBC)
with two receivers and one eavesdropper, an analogous Gelfand-Pinsker BC (GPBC)
is constructed by converting the eavesdropper's observation sequence into a
state sequence with an appropriate product distribution (induced by the
stealth-wiretap code for the WTBC), and non-causally revealing the states to
the encoder. The transition matrix of the state-dependent GPBC is extracted
from the WTBC's transition law, with the eavesdropper's output playing the role of
the channel state. Past capacity results for the semi-deterministic (SD) GPBC
and the physically-degraded (PD) GPBC with an informed receiver are leveraged
to furnish analogy-based converse proofs for the analogous WTBC setups. This
characterizes the secrecy-capacity regions of the SD-WTBC and the PD-WTBC, in
which the stronger receiver also observes the eavesdropper's channel output.
These derivations exemplify how the wiretap-GP analogy enables translating
results on one problem into advances in the study of the other.
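To place the analogy in context, recall the classical single-user capacity formulas (textbook results quoted here only for orientation, not claims of this work): the Gelfand-Pinsker capacity with non-causal encoder state information and the Csiszár-Körner secrecy capacity of the wiretap channel are
\[
C_{\mathrm{GP}} = \max_{P_{U|S},\, x(u,s)} \big[ I(U;Y) - I(U;S) \big],
\qquad
C_{\mathrm{WT}} = \max_{P_{U,X}} \big[ I(U;Y) - I(U;Z) \big],
\]
so the eavesdropper's output $Z$ formally plays the role of the state $S$, which is precisely the substitution the construction above carries out at the broadcast level.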
Information Storage in the Stochastic Ising Model
Most information systems store data by modifying the local state of matter,
in the hope that atomic (or sub-atomic) local interactions would stabilize the
state for a sufficiently long time, thereby allowing later recovery. In this
work we initiate the study of information retention in locally-interacting
systems. The evolution in time of the interacting particles is modeled via the
stochastic Ising model (SIM). The initial spin configuration $X_0$ serves as the user-controlled input. The output configuration $X_t$ is produced by running $t$ steps of the Glauber chain. Our main goal is to evaluate the information capacity $I_n(t)$ when the time $t$ scales with the size of the system $n$. For the zero-temperature SIM on the
two-dimensional $\sqrt{n} \times \sqrt{n}$ grid and free boundary conditions, it is easy to show that $I_n(t) = \Theta(n)$ for $t = O(n)$. In addition, we show that on the order of $\sqrt{n}$ bits can be stored for infinite time in striped configurations. The $\sqrt{n}$ achievability is optimal when $t \to \infty$ and $n$ is fixed.
One of the main results of this work is an achievability scheme that stores more than $\sqrt{n}$ bits (in orders of magnitude) for superlinear (in $n$) times. The analysis of the scheme decomposes the system into independent Z-channels whose crossover probability is found via the (recently rigorously established) Lifshitz law of phase boundary movement. We also
provide results for the positive but small temperature regime. We show that an initial configuration drawn according to the Gibbs measure cannot retain more than a single bit beyond a certain time threshold. On the other hand, when scaling time with the inverse temperature $\beta$, the stripe-based coding scheme (that stores for infinite time at zero temperature) is shown to retain its bits for time that is exponential in $\beta$.
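As a concrete illustration of the dynamics and of the stripe-based storage idea, here is a minimal simulation sketch (an illustration under stated assumptions, not code from the paper; the grid size and stripe width are arbitrary choices):

```python
# Zero-temperature Glauber dynamics on an L x L grid with free boundary
# conditions, started from a striped configuration. At zero temperature an
# updated spin aligns with the strict majority of its neighbors and is
# resampled by a fair coin on a tie.
import numpy as np

def glauber_step(sigma, rng):
    L = sigma.shape[0]
    i, j = rng.integers(L), rng.integers(L)
    s = 0  # sum over existing neighbors only (free boundary conditions)
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < L and 0 <= nj < L:
            s += sigma[ni, nj]
    if s > 0:
        sigma[i, j] = 1
    elif s < 0:
        sigma[i, j] = -1
    else:
        sigma[i, j] = rng.choice((-1, 1))  # tie broken uniformly at random

def striped_input(L, stripe_width=2):
    # Horizontal stripes of width >= 2 are fixed points of the zero-temperature
    # dynamics, which is why such configurations can store bits indefinitely.
    row_sign = np.where((np.arange(L) // stripe_width) % 2 == 0, 1, -1)
    return np.repeat(row_sign[:, None], L, axis=1)

rng = np.random.default_rng(0)
sigma = striped_input(L=32)
for _ in range(10_000):
    glauber_step(sigma, rng)
assert np.array_equal(sigma, striped_input(L=32))  # the stripes survive
```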
Convergence of Smoothed Empirical Measures with Applications to Entropy Estimation
This paper studies convergence of empirical measures smoothed by a Gaussian
kernel. Specifically, consider approximating $P \ast \mathcal{N}_\sigma$, for $\mathcal{N}_\sigma \triangleq \mathcal{N}(0, \sigma^2 \mathrm{I}_d)$, by $\hat{P}_n \ast \mathcal{N}_\sigma$, where $\hat{P}_n$ is the empirical measure of $n$ i.i.d. samples from $P$,
under different statistical distances. The convergence is examined in terms of
the Wasserstein distance, total variation (TV), Kullback-Leibler (KL)
divergence, and $\chi^2$-divergence. We show that the approximation error under the TV distance and 1-Wasserstein distance ($\mathsf{W}_1$) converges at rate $e^{O(d)} n^{-1/2}$, in remarkable contrast to a typical $n^{-1/d}$ rate for unsmoothed $\mathsf{W}_1$ (and $d \geq 3$). For the KL divergence, squared 2-Wasserstein distance ($\mathsf{W}_2^2$), and $\chi^2$-divergence, the convergence rate is $e^{O(d)} n^{-1}$, but only if $P$ achieves finite input-output $\chi^2$ mutual information across the additive white Gaussian noise channel. If the latter condition is not met, the rate changes to $\omega(n^{-1})$ for the KL divergence and $\mathsf{W}_2^2$, while the $\chi^2$-divergence becomes infinite, a curious dichotomy. As a main
application we consider estimating the differential entropy $h(P \ast \mathcal{N}_\sigma)$ in the high-dimensional regime. The distribution $P$ is unknown, but $n$ i.i.d. samples from it are available. We first show that any good estimator of $h(P \ast \mathcal{N}_\sigma)$ must have sample complexity that is exponential in $d$. Using the empirical approximation results we then show that the absolute-error risk of the plug-in estimator converges at the parametric rate $e^{O(d)} n^{-1/2}$, thus establishing the minimax rate-optimality of the plug-in. Numerical results that demonstrate a significant empirical superiority of the plug-in approach to general-purpose differential entropy estimators are provided.
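For intuition, the plug-in estimator above is simply the differential entropy of an $n$-component Gaussian mixture and can be evaluated by Monte Carlo. A minimal sketch follows (illustrative code with arbitrary sample sizes, not the paper's implementation):

```python
# Plug-in estimator of h(P * N_sigma): replace P by the empirical measure of
# the samples, so the estimate is the differential entropy of an n-component
# Gaussian mixture, evaluated here by Monte Carlo.
import numpy as np
from scipy.special import logsumexp

def plugin_entropy(samples, sigma, num_mc=2000, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    n, d = samples.shape
    # Draw Monte Carlo points from the smoothed empirical measure P_hat * N_sigma.
    x = samples[rng.integers(n, size=num_mc)] \
        + sigma * rng.standard_normal((num_mc, d))
    # Log-density of the equal-weight Gaussian mixture at the Monte Carlo points.
    sq_dists = ((x**2).sum(1)[:, None] + (samples**2).sum(1)[None, :]
                - 2.0 * x @ samples.T)                      # shape (num_mc, n)
    log_q = (logsumexp(-sq_dists / (2 * sigma**2), axis=1)
             - np.log(n) - 0.5 * d * np.log(2 * np.pi * sigma**2))
    return -log_q.mean()  # Monte Carlo estimate of h(P_hat_n * N_sigma)

# Sanity check: for P = N(0, I_d), h(P * N_sigma) = (d/2) log(2*pi*e*(1+sigma^2)).
rng = np.random.default_rng(1)
d, n, sigma = 2, 1000, 1.0
est = plugin_entropy(rng.standard_normal((n, d)), sigma, rng=rng)
truth = 0.5 * d * np.log(2 * np.pi * np.e * (1 + sigma**2))
```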
Capacity of Continuous Channels with Memory via Directed Information Neural Estimator
Calculating the capacity (with or without feedback) of channels with memory
and continuous alphabets is a challenging task. It requires optimizing the
directed information (DI) rate over all channel input distributions. The
objective is a multi-letter expression, whose analytic solution is only known
for a few specific cases. When no analytic solution is present or the channel
model is unknown, there is no unified framework for calculating or even
approximating capacity. This work proposes a novel capacity estimation
algorithm that treats the channel as a 'black box', both with and without feedback. The algorithm has two main ingredients: (i) a neural distribution
transformer (NDT) model that shapes a noise variable into the channel input
distribution, which we are able to sample, and (ii) the DI neural estimator
(DINE) that estimates the communication rate of the current NDT model. These
models are trained by an alternating maximization procedure to both estimate
the channel capacity and obtain an NDT for the optimal input distribution. The
method is demonstrated on the moving average additive Gaussian noise channel,
where it is shown that both the capacity and feedback capacity are estimated
without knowledge of the channel transition kernel. The proposed estimation
framework opens the door to a myriad of capacity approximation results for
continuous-alphabet channels that were inaccessible until now.
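For reference, the multi-letter objective mentioned above is the directed information rate; in standard notation (a textbook definition, not specific to this work),
\[
I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i; Y_i \mid Y^{i-1}),
\qquad
C_{\mathrm{FB}} = \lim_{n \to \infty} \frac{1}{n} \max_{P_{X^n \| Y^{n-1}}} I(X^n \to Y^n),
\]
where the maximization is over causally conditioned input distributions and the limit characterizes feedback capacity under suitable conditions on the channel; without feedback the maximization is over unconditional input distributions $P_{X^n}$. The NDT parameterizes the input distribution in this optimization, while DINE estimates the resulting directed information rate.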
Max-Sliced Mutual Information
Quantifying the dependence between high-dimensional random variables is
central to statistical learning and inference. Two classical methods are
canonical correlation analysis (CCA), which identifies maximally correlated
projected versions of the original variables, and Shannon's mutual information,
which is a universal dependence measure that also captures high-order
dependencies. However, CCA only accounts for linear dependence, which may be
insufficient for certain applications, while mutual information is often
infeasible to compute/estimate in high dimensions. This work proposes a middle
ground in the form of a scalable information-theoretic generalization of CCA,
termed max-sliced mutual information (mSMI). mSMI equals the maximal mutual
information between low-dimensional projections of the high-dimensional
variables, which reduces back to CCA in the Gaussian case. It enjoys the best
of both worlds: capturing intricate dependencies in the data while being
amenable to fast computation and scalable estimation from samples. We show that
mSMI retains favorable structural properties of Shannon's mutual information,
like variational forms and identification of independence. We then study
statistical estimation of mSMI, propose an efficiently computable neural
estimator, and couple it with formal non-asymptotic error bounds. We present
experiments that demonstrate the utility of mSMI for several tasks,
encompassing independence testing, multi-view representation learning,
algorithmic fairness, and generative modeling. We observe that mSMI
consistently outperforms competing methods with little-to-no computational
overhead.
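Up to notation (the projection dimension $k$ and the orthonormality constraint are shorthand introduced here, so this should be read as a sketch of the definition rather than a verbatim quote), the quantity can be written as
\[
\mathsf{mSMI}_k(X;Y) = \sup_{A \in \mathrm{St}(d_x,k),\; B \in \mathrm{St}(d_y,k)} I\big(A^{\top} X;\, B^{\top} Y\big),
\]
where $\mathrm{St}(d,k)$ denotes the set of $d \times k$ matrices with orthonormal columns. For jointly Gaussian $(X,Y)$ and one-dimensional projections, the optimizers coincide with the leading CCA directions, which is the sense in which mSMI reduces back to CCA.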